    Efficient Egocentric Visual Perception Combining Eye-tracking, a Software Retina and Deep Learning

    We present ongoing work to harness biological approaches to achieving highly efficient egocentric perception by combining the space-variant imaging architecture of the mammalian retina with Deep Learning methods. By pre-processing images collected by means of eye-tracking glasses to control the fixation locations of a software retina model, we demonstrate that we can reduce the input to a DCNN by a factor of 3, reduce the required number of training epochs and obtain over 98% classification rates when training and validating the system on a database of over 26,000 images of 9 object classes. (Accepted for the EPIC Workshop at the European Conference on Computer Vision, ECCV 2018.)
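
    For intuition, below is a minimal sketch of the kind of space-variant, fixation-centred sampling a software retina performs, assuming a simple log-polar scheme; the function name, image size and fixation point are illustrative and not the authors' implementation.

        import numpy as np

        def retina_sample(image, fixation, n_rings=64, n_wedges=128, max_radius=None):
            """Sample an image on a log-polar grid centred on a fixation point.

            Receptive fields are dense near the fixation (fovea) and sparse in
            the periphery, so the output is far smaller than the input image.
            """
            h, w = image.shape[:2]
            fy, fx = fixation  # in the paper's setup this would come from the eye-tracker
            if max_radius is None:
                max_radius = min(h, w) / 2
            # Log-spaced ring radii: fine near the centre, coarse at the edge.
            radii = np.logspace(0, np.log10(max_radius), n_rings)
            thetas = np.linspace(0, 2 * np.pi, n_wedges, endpoint=False)
            ys = np.clip((fy + radii[:, None] * np.sin(thetas)).astype(int), 0, h - 1)
            xs = np.clip((fx + radii[:, None] * np.cos(thetas)).astype(int), 0, w - 1)
            return image[ys, xs]  # shape: (n_rings, n_wedges[, channels])

        # Illustrative numbers only: a 1-megapixel frame collapses to a 64x128 map.
        frame = np.random.rand(1000, 1000, 3)
        cortical = retina_sample(frame, fixation=(500, 500))
        print(frame.size, "->", cortical.size)  # 3,000,000 -> 24,576 values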

    Recognising the Clothing Categories from Free-Configuration Using Gaussian-Process-Based Interactive Perception

    In this paper, we propose a Gaussian-Process-based interactive perception approach for recognising highly wrinkled clothes. We have integrated this recognition method within a clothes sorting pipeline for the pre-washing stage of an autonomous laundering process. Our approach differs from reported clothing manipulation approaches by allowing the robot to update its perception confidence via numerous interactions with the garments. The classifiers predominantly reported in clothing perception studies (e.g. SVM, Random Forest) do not provide true classification probabilities, due to their inherent structure. In contrast, probabilistic classifiers (of which the Gaussian Process is a popular example) are able to provide predictive probabilities. In our approach, we employ multi-class Gaussian Process classification using the Laplace approximation for posterior inference, optimising hyper-parameters via marginal likelihood maximisation. Our experimental results show that our approach is able to recognise unknown garments from highly occluded and wrinkled configurations and demonstrates a substantial improvement over non-interactive perception approaches.
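
    As a rough illustration of the classification machinery described above, the sketch below uses scikit-learn's GaussianProcessClassifier, which likewise performs Laplace-approximated posterior inference and fits kernel hyper-parameters by marginal likelihood maximisation; the feature vectors, labels and confidence threshold are synthetic stand-ins, not the paper's garment data.

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessClassifier
        from sklearn.gaussian_process.kernels import RBF

        # Synthetic stand-in for garment feature vectors (e.g. depth/shape features).
        rng = np.random.default_rng(0)
        X = rng.normal(size=(120, 16))    # 120 observations, 16-D features
        y = rng.integers(0, 4, size=120)  # 4 clothing categories

        # The Laplace approximation is used internally for the non-Gaussian
        # posterior; kernel hyper-parameters are fitted by maximising the
        # marginal likelihood.
        gpc = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0))
        gpc.fit(X, y)

        # Predictive probabilities let the robot defer a decision and interact
        # with the garment again whenever its confidence is low.
        probs = gpc.predict_proba(X[:1])
        if probs.max() < 0.7:           # illustrative confidence threshold
            print("uncertain:", probs)  # trigger another interactive observation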

    A Biologically Motivated Software Retina for Robotic Sensors for ARM-Based Mobile Platform Technology

    A key issue in designing robotic systems is the cost of an integrated camera sensor that meets the bandwidth/processing requirements of many advanced robotics applications, especially lightweight ones such as visual surveillance or SLAM in autonomous aerial vehicles. There is currently much work on adapting smartphones to provide complete robot vision systems, as the smartphone is so exquisitely integrated, providing camera(s), inertial sensing, sound I/O and excellent wireless connectivity. Mass-market production makes this a very low-cost platform, and manufacturers, from quadrotor drone suppliers to makers of children's toys such as the Meccanoid robot [5], employ a smartphone to provide the vision/control system [7,8]. Accordingly, many research groups are attempting to optimise image analysis, computer vision and machine learning libraries for the smartphone platform. However, current approaches to robot vision remain highly demanding for mobile processors such as the ARM, and while a number of algorithms have been developed, these are very stripped down, i.e. highly compromised in function or performance. For example, the semi-dense visual odometry implementation of [1] operates on images of only 320x240 pixels.

    In our research we have been developing biologically motivated foveated vision algorithms based on a model of the mammalian retina [2], potentially 100 times more efficient than their conventional counterparts. Vision systems based on the foveated architectures found in mammals therefore also have the potential to reduce bandwidth and processing requirements by a factor of about 100; it has been estimated that our brains would weigh ~60 kg if we were to process all our visual input at uniform high resolution. We have reported a foveated visual architecture [2,3,4] that implements a functional model of the retina-visual cortex to produce feature vectors that can be matched/classified using conventional methods, or indeed adapted to employ Deep Convolutional Neural Nets for the classification/interpretation stage. Given the above processing/bandwidth limitations, a viable way forward is to perform off-line learning and implement the forward recognition path on the mobile platform, returning simple object labels, or sparse hierarchical feature symbols, and gaze control commands to the host robot vision system and controller.

    We are now at the early stages of investigating how best to port our foveated architecture onto an ARM-based smartphone platform. To achieve the required levels of performance we propose to port and optimise our retina model to the mobile ARM processor architecture in conjunction with its integrated GPU. We will then be in a position to provide a foveated smart vision system on a smartphone, with the advantage of processing speed gains and bandwidth optimisations. Our approach will be to develop efficient parallelising compilers and perhaps propose new processor architectural features to support this approach to computer vision, e.g. efficient processing of hexagonally sampled foveated images.

    Our current goal is to have a foveated system running in real time on at least a 1080p input video stream, to serve as a front-end robot sensor for tasks such as general-purpose object recognition and reliable dense SLAM using a commercial off-the-shelf smartphone. Initially, this system would communicate a symbol stream to conventional hardware performing back-end visual classification/interpretation, although simple object detection and recognition tasks should be possible on-board the device. We propose that, as in Nature, foveated vision is the key to achieving the necessary data reduction to be able to implement complete visual recognition and learning processes on the smartphone itself.
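
    To make the quoted ~100x data-reduction figure concrete, the back-of-envelope sketch below compares a 1080p video stream against a hypothetical 20,000-node retina tessellation; the node count is an assumption chosen purely for illustration, not a figure from this abstract.

        # Back-of-envelope data-reduction estimate (illustrative numbers only).
        frame_pixels = 1920 * 1080        # ~2.07 MP per 1080p frame
        retina_nodes = 20_000             # hypothetical retina tessellation size
        print(f"{frame_pixels / retina_nodes:.0f}x fewer samples per frame")  # ~104x

        # Raw bandwidth at 30 fps, 3 bytes per sample, before any compression:
        uniform_mbps = frame_pixels * 3 * 30 / 1e6  # ~186.6 MB/s
        retina_mbps = retina_nodes * 3 * 30 / 1e6   # ~1.8 MB/s
        print(f"{uniform_mbps:.1f} MB/s vs {retina_mbps:.1f} MB/s")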

    A space-variant visual pathway model for data efficient deep learning

    We present an investigation into adopting a model of the retino-cortical mapping, found in biological visual systems, to improve the efficiency of image analysis using Deep Convolutional Neural Nets (DCNNs) in the context of robot vision and egocentric perception systems. This work has now enabled DCNNs to process input images approaching one million pixels in size, in real time, using only consumer-grade graphics processor (GPU) hardware in a single pass of the DCNN.

    Smart Visual Sensing Using a Software Retina Model

    We present an approach to efficient visual sensing and perception based on a non-uniformly sampled, biologically inspired, software retina that, when combined with a DCNN classifier, has enabled megapixel-sized camera input images to be processed in a single pass, while maintaining state-of-the-art recognition performance.

    Deep reinforcement learning control of hand-eye coordination with a software retina

    Deep Reinforcement Learning (DRL) has gained much attention for solving robotic hand-eye coordination tasks from raw pixel values. Despite promising results, training agents on images is hardware intensive, often requiring millions of training steps to converge, which incurs long training times and an increased risk of wear and tear on the robot. To speed up training, images are often cropped and downscaled, resulting in a smaller field of view and the loss of valuable high-frequency data. In this paper, we propose training the vision system using supervised learning prior to training the robotic actuation using Deep Deterministic Policy Gradient (DDPG). The vision system uses a software retina, based on the mammalian retino-cortical transform, to preprocess full-size images, compressing the image data while preserving the full field of view and the high-frequency visual information around the fixation point, before a Deep Convolutional Neural Network (DCNN) extracts visual state information. Using the vision system to preprocess the environment reduces the agent's sample complexity and speeds up network updates, leading to significantly faster training with less image-data loss. Our method is used to train a DRL system to control a real Baxter robot's arm, processing full-size images captured by an in-wrist camera to locate an object on a table and centre the camera over it by actuating the robot arm.
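
    For context, a compact PyTorch sketch of the DDPG actor-critic update used for the actuation stage is given below; the network sizes, hyper-parameters and the 128-dimensional retina/DCNN state vector are illustrative assumptions rather than details taken from the paper.

        import torch
        import torch.nn as nn

        STATE_DIM, ACTION_DIM = 128, 7   # assumed retina/DCNN features, 7-DoF arm

        def mlp(in_dim, out_dim):
            return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, out_dim))

        actor = nn.Sequential(mlp(STATE_DIM, ACTION_DIM), nn.Tanh())
        actor_tgt = nn.Sequential(mlp(STATE_DIM, ACTION_DIM), nn.Tanh())
        actor_tgt.load_state_dict(actor.state_dict())
        critic = mlp(STATE_DIM + ACTION_DIM, 1)
        critic_tgt = mlp(STATE_DIM + ACTION_DIM, 1)
        critic_tgt.load_state_dict(critic.state_dict())
        actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
        critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
        GAMMA, TAU = 0.99, 0.005

        def ddpg_update(s, a, r, s2, done):
            """One DDPG step on a replay-buffer minibatch (s, a, r, s2, done)."""
            with torch.no_grad():  # bootstrapped TD target from target networks
                q2 = critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=1))
                target = r + GAMMA * (1 - done) * q2
            q = critic(torch.cat([s, a], dim=1))  # critic regresses to target
            critic_loss = nn.functional.mse_loss(q, target)
            critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

            # Actor ascends the critic's value estimate of its own actions.
            actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
            actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

            for net, tgt in ((actor, actor_tgt), (critic, critic_tgt)):
                for p, tp in zip(net.parameters(), tgt.parameters()):
                    tp.data.mul_(1 - TAU).add_(TAU * p.data)  # soft target update

        # Synthetic minibatch standing in for replay-buffer samples.
        B = 32
        ddpg_update(torch.randn(B, STATE_DIM), torch.randn(B, ACTION_DIM),
                    torch.randn(B, 1), torch.randn(B, STATE_DIM), torch.zeros(B, 1))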

    Spatiotemporal mortality and demographic trends in a small cetacean: Strandings to inform conservation management

    With global increases in anthropogenic pressures on wildlife populations comes a responsibility to manage them effectively. Assessing marine ecosystem health is challenging and often relies on monitoring indicator species, such as cetaceans. Most cetaceans, however, are highly mobile and spend the majority of their time hidden from direct view, resulting in uncertainty over even the most basic population metrics. Here, we discuss long-term, internationally combined stranding records as a valuable source of information on the demographic and mortality trends of the harbour porpoise (Phocoena phocoena) in the North Sea. We analysed stranding records (n = 16,181) from 1990 to 2017 and demonstrate a strongly heterogeneous seasonal pattern of strandings throughout the North Sea, indicative of season-specific distribution or habitat use, and season-specific mortality. The annual incidence of strandings has increased since 1990, with a notably steeper rise in the southern North Sea since 2005. A high density of neonatal strandings occurred specifically in the eastern North Sea, indicative of areas important for calving, and large numbers of juvenile males stranded in the southern parts, indicative of a population sink or reflecting higher male dispersion. These findings highlight the power of stranding records to detect potentially vulnerable population groups in time and space. This knowledge is vital for managers and can guide conservation measures such as the establishment of time- and area-specific limits on potentially harmful human activities, aiming to reduce the number and intensity of human-wildlife conflicts.
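
    As a sketch of how such seasonal and regional patterns can be pulled out of raw stranding records, the snippet below aggregates a synthetic record table by month and region using pandas; the column names and data are invented for illustration and do not reflect the actual stranding database.

        import numpy as np
        import pandas as pd

        # Synthetic stand-in for a stranding database (column names are assumed).
        rng = np.random.default_rng(1)
        n = 500
        records = pd.DataFrame({
            "date": pd.to_datetime("1990-01-01")
                    + pd.to_timedelta(rng.integers(0, 365 * 28, n), unit="D"),
            "region": rng.choice(["southern North Sea", "eastern North Sea"], n),
            "age_class": rng.choice(["neonate", "juvenile", "adult"], n),
        })

        # Monthly stranding counts per region reveal season-specific patterns.
        monthly = (records
                   .assign(month=records["date"].dt.month)
                   .groupby(["region", "month"])
                   .size()
                   .unstack(fill_value=0))
        print(monthly)

        # Annual totals expose long-term trends such as a post-2005 increase.
        annual = records.groupby(records["date"].dt.year).size()
        print(annual.tail())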

    The stranding anomaly as population indicator: the case of Harbour Porpoise Phocoena phocoena in North-Western Europe

    Ecological indicators for monitoring strategies are expected to combine three major characteristics: ecological significance, statistical credibility, and cost-effectiveness. Strategies based on stranding networks rank highly in cost-effectiveness, but their ecological significance and statistical credibility are disputed. Our goal here is to improve the value of stranding data as a population indicator within monitoring strategies by constructing the spatial and temporal null hypothesis for strandings. The null hypothesis is defined as: small cetacean distribution and mortality are uniform in space and constant in time. We used a drift model to map stranding probabilities and predict stranding patterns of cetacean carcasses under H0 across the North Sea, the Channel and the Bay of Biscay for the period 1990-2009. As the most common cetacean occurring in this area, we chose the harbour porpoise Phocoena phocoena for our modelling. The difference between the strandings expected under H0 and the observed strandings is defined as the stranding anomaly; it constitutes the stranding data series corrected for drift conditions. Seasonal decomposition of the stranding anomaly suggested that drift conditions did not explain the observed seasonal variations in porpoise strandings. Long-term stranding anomalies increased first in the southern North Sea and along the Channel and Bay of Biscay coasts, and finally in the eastern North Sea. The hypothesis of changes in porpoise distribution was consistent with local visual surveys, most notably the SCANS surveys (1994 and 2005). This new indicator could be applied to cetacean populations across the world and, more widely, to marine megafauna.
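
    A minimal sketch of the anomaly construction follows: strandings expected under the drift-based null hypothesis H0 are subtracted from observed counts, and the residual series is seasonally decomposed; the synthetic numbers and the use of statsmodels stand in for the study's actual drift model.

        import numpy as np
        import pandas as pd
        from statsmodels.tsa.seasonal import seasonal_decompose

        rng = np.random.default_rng(2)
        months = pd.date_range("1990-01", "2009-12", freq="MS")

        # Observed monthly strandings (synthetic) and the counts expected under
        # H0 from a drift model (here faked as a smooth seasonal baseline).
        t = np.arange(len(months))
        expected = 20 + 5 * np.sin(2 * np.pi * t / 12)
        observed = expected + 0.05 * t + rng.normal(0, 3, len(months))

        # The stranding anomaly: observed minus drift-expected strandings.
        anomaly = pd.Series(observed - expected, index=months)

        # Decompose the anomaly into trend, seasonal and residual components;
        # a persistent trend then signals a distribution or mortality change
        # rather than mere variation in carcass drift conditions.
        result = seasonal_decompose(anomaly, model="additive", period=12)
        print(result.trend.dropna().iloc[[0, -1]])  # rising trend over 1990-2009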